feat: track allocations per fleet#4513
Conversation
5dea128 to
38ae676
Compare
|
/gcbrun |
|
Build Failed 😭 Build Id: 0200e9ed-b8cb-4f9c-82a3-22a40ca1032e Status: FAILURE To get permission to view the Cloud Build view, join the agones-discuss Google Group. |
|
/gcbrun |
|
Build Succeeded 🥳 Build Id: 52edf034-8467-427b-90f9-765fd733d9c6 The following development artifacts have been built, and will exist for the next 30 days:
A preview of the website (the last 30 builds are retained): To install this version: |
markmandel
left a comment
There was a problem hiding this comment.
Thanks for doing this -- sorry it took a bit to get to it.
We'll need some docs at https://agones.dev/site/docs/guides/metrics/ -- but also will need to be feature shortcoded -- https://agones.dev/site/docs/contribute/documentation-editing-contribution/
| gs.Status.State = agonesv1.GameServerStateAllocated | ||
| ctrl.gsWatch.Modify(gs) | ||
|
|
||
| require.Eventually(t, func() bool { |
There was a problem hiding this comment.
If you use https://pkg.go.dev/github.com/stretchr/testify@v1.11.1/require#EventuallyWithT you can check multiple assertions, and also get better error messages.
| ReservedReplicas int32 `json:"reservedReplicas"` | ||
| // AllocatedReplicas are the number of Allocated GameServer replicas | ||
| AllocatedReplicas int32 `json:"allocatedReplicas"` | ||
| // Allocations is a counter of the number of allocations observed. |
There was a problem hiding this comment.
New docs will need feature shortcodes please:
https://agones.dev/site/docs/contribute/documentation-editing-contribution/
| defer cancel() | ||
|
|
||
| c.allocsMu.Lock() | ||
| defer c.allocsMu.Unlock() |
There was a problem hiding this comment.
Nice one on making sure it gets processed on shutdown.
Probably doesn't make a huge difference, but holding the whole mutex over the network calls may cause some interesting long term locking over shutdown. Might be worth taking a snapshot - or the very least, moving to a RWLock, to avoid some potential contention.
| count, ok := c.allocs[key] | ||
| if !ok { | ||
| c.allocs[key] = 1 | ||
| return | ||
| } | ||
| c.allocs[key] = count + 1 |
There was a problem hiding this comment.
| count, ok := c.allocs[key] | |
| if !ok { | |
| c.allocs[key] = 1 | |
| return | |
| } | |
| c.allocs[key] = count + 1 | |
| c.allocs[key]++ |
maps will 0 value here, just as a suggestion
|
/gcbrun |
|
Build Succeeded 🥳 Build Id: 6178bfd6-30e7-4060-ba43-0e3864f400b8 The following development artifacts have been built, and will exist for the next 30 days:
A preview of the website (the last 30 builds are retained): To install this version: |
|
/gcbrun |
1 similar comment
|
/gcbrun |
|
Build Succeeded 🥳 Build Id: c09603eb-c5f9-4bac-84cf-41668d315953 The following development artifacts have been built, and will exist for the next 30 days:
A preview of the website (the last 30 builds are retained): To install this version: |
Signed-off-by: Nicholas Wiersma <nick@wiersma.co.za>
Signed-off-by: Nicholas Wiersma <nick@wiersma.co.za>
Signed-off-by: Nicholas Wiersma <nick@wiersma.co.za>
Signed-off-by: Nicholas Wiersma <nick@wiersma.co.za>
Signed-off-by: Nicholas Wiersma <nick@wiersma.co.za>
Signed-off-by: Nicholas Wiersma <nick@wiersma.co.za>
Signed-off-by: Nicholas Wiersma <nick@wiersma.co.za>
Signed-off-by: Nicholas Wiersma <nick@wiersma.co.za>
Signed-off-by: Nicholas Wiersma <nick@wiersma.co.za>
Signed-off-by: Nicholas Wiersma <nick@wiersma.co.za>
|
/gcbrun Sorry - I missed this one for a while! |
|
No stress. When you happy to merge, ill update the versions again in the docs. |
|
Build Succeeded 🥳 Build Id: dc7a3c1b-932b-4995-ba4f-5f1ce0238653 The following development artifacts have been built, and will exist for the next 30 days:
A preview of the website (the last 30 builds are retained): To install this version: |
markmandel
left a comment
There was a problem hiding this comment.
Looks good! Just need to update the version numbers in the docs, and this is good to go!
Thanks!
|
|
||
|
|
||
|
|
||
| {{% feature expiryVersion="1.58.0" %}} |
There was a problem hiding this comment.
Sorry to make you do this, but yeah - update the version to 1.59
What type of PR is this?
What this PR does / Why we need it:
This PR adds allocations tracking on a Fleet level to provide real allocation metrics to the Fleet Autoscaler. This provides better metrics to base custom autoscaling off of while also providing a good metric to the system
Which issue(s) this PR fixes:
Closes #4452
Special notes for your reviewer:
Initially I thought to use a
RWMutexandatomic.Int64, but the locking logic got really verbose and I dont think, even in active clusters, that the contention on aMutexis likely to be a real issue, given how low the actual lock time will be. If this proves to no longer be true, it is simple to change it.It also occurred to me that queuing the Fleet when an Allocation is observed is likely not needed, as this will always result in either a change in AllocatedReplicas, ReadyReplicas or Replicas on the
GameServerSetwhich will itself queue theFleet. I will run some manual tests to prove this.